02 June 2014

How many 5's can you find?

Proximity

Repetition

Contrast

Subtraction

Making Sense of Data

"The ability to take data—to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it—that’s going to be a hugely important skill in the next decades, … because now we really do have essentially free and ubiquitous data. So the complimentary scarce factor is the ability to understand that data and extract value from it."

Hal Varian, Google’s Chief Economist

Focus their Attention

“What information consumes is rather obvious: it consumes the attention of its recipients. Hence a wealth of information creates a poverty of attention, and a need to allocate that attention efficiently among the overabundance of information sources that might consume it.”

Herb Simon

Modes of Thinking

  • Writing (Verbal)
  • Symbolic (Math-logic)
  • Geometric (Visual)
  • Interactive (Kinesthetic)

Writing (Verbal)

Pythagoras' Theorem

Pythagoras' theorem is a relation in Euclidean geometry among the three sides of a right triangle. It states that the square of the hypotenuse (the side opposite the right angle) is equal to the sum of the squares of the other two sides.

Symbolic (Math-logic)

Pythagoras' Theorem

For every \(\triangle XYZ\) with \(\angle XYZ = 90^\circ\) and side lengths \(XY = a\), \(YZ = b\) and \(XZ = c\), the following relationship holds:

\(a^2 + b^2 = c^2\)

Geometric (Visual)

Pythagoras' Theorem

Interactive (Kinesthetic)

Pythagoras' Theorem

Visual Power of the Human Brain

Pattern Recognition Machine

Simple

  • Driving on the Road
  • Facial Expression
  • Face Recognition

Complex

  • CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart)
  • Chess and Go
  • Meteorological Forecasts

Visualization

/ˌvɪʒʊəlaɪˈzeɪʃən / (noun)

Derived from the Latin verb videre, "to look, to see"

"The act or instance to form a mental image or picture (without an object) or… the act or instance to make visible or visual (with an object)"

“Transformation of the symbolic into the geometric” - McCormick et al. 1987

“The use of computer-generated, interactive, visual representations of abstract data to amplify cognition.” - Card, Mackinlay, & Shneiderman 1999

The Value of Visualization

Graphical Inference

William Playfair (1786)

Johann Lambert (1769)

Graph of rate of evaporation of water vs. temperature

John Snow (1854)

Florence Nightingale (1857)

Charles Minard (1869)

Jacques Bertin (1967)

Book: Sémiologie graphique / Semiology of Graphics

Visual language is a sign language

  • Images perceived as a set of signs
  • Sender encodes information in signs
  • Receiver decodes information from signs

“… finding the artificial memory that best supports our natural means of perception.”

Jacques Bertin (1967)

  • A, B, C are distinguishable
  • B is between A and C.
  • BC is twice as long as AB.

∴ Encode quantitative variables

"Resemblance, order and proportion are the three signifieds in graphics.” - Bertin

Francis Anscombe (1973)

Anscombe's Quartet

anscombe
##    x1 x2 x3 x4    y1   y2    y3    y4
## 1  10 10 10  8  8.04 9.14  7.46  6.58
## 2   8  8  8  8  6.95 8.14  6.77  5.76
## 3  13 13 13  8  7.58 8.74 12.74  7.71
## 4   9  9  9  8  8.81 8.77  7.11  8.84
## 5  11 11 11  8  8.33 9.26  7.81  8.47
## 6  14 14 14  8  9.96 8.10  8.84  7.04
## 7   6  6  6  8  7.24 6.13  6.08  5.25
## 8   4  4  4 19  4.26 3.10  5.39 12.50
## 9  12 12 12  8 10.84 9.13  8.15  5.56
## 10  7  7  7  8  4.82 7.26  6.42  7.91
## 11  5  5  5  8  5.68 4.74  5.73  6.89

Anscombe's Quartet

Computing the Stats

Mean

\(\mu_x = 9\); \(\mu_y = 7.5\)

Variance and Correlation

\(\sigma^2_x = 11\); \(\sigma^2_y = 4.1\); \(cor(x,y) = 0.816\)

Linear Regression

\(y = 3.00 + 0.500x\)

\(R^2 = 0.667\)
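A minimal sketch of how these figures could be checked, using the `anscombe` data frame that ships with R's datasets package. The `stats` matrix and its row names are illustrative choices, not part of the original slides.

```r
# Summary statistics for each of the four x/y pairs in the built-in anscombe data
stats <- sapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  fit <- lm(y ~ x)                      # simple linear regression of y on x
  c(mean_x    = mean(x),
    mean_y    = mean(y),
    var_x     = var(x),
    var_y     = var(y),
    cor_xy    = cor(x, y),
    intercept = unname(coef(fit)[1]),
    slope     = unname(coef(fit)[2]),
    r_squared = summary(fit)$r.squared)
})
colnames(stats) <- paste0("set", 1:4)

# One column per data set; the columns should be nearly identical
round(stats, 2)
```

The four columns agree to about two decimal places, which is why the summary numbers alone cannot tell the data sets apart.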

Anscombe's Quartet

Plot the Relationship

John Tukey (1977)

Exploratory Data Analysis: An approach to analyzing data sets that summarizes their main characteristics, often with visual methods

Edward Tufte (1983)

Book: The Visual Display of Quantitative Information

“Above all else, show the data.”

Data-Ink ratio = data-ink / total-ink used in graphics

Improve data-ink ratio

Edward Tufte (1983)

Don't lie with statistics

William Cleveland (1985)

Book: The Elements of Graphing Data

"The important criterion for a graph is not simply how fast we can see a result; rather it is whether through the use of the graph we can see something that would have been harder to see otherwise or that could not have been seen at all."

  • A graphic should display as much information as it can, with the lowest possible cognitive strain to the viewer.
  • Visualization is an iterative process. Graph the data, learn what you can, and then regraph the data to answer the questions that arise from your previous graphic.

William Cleveland (1985)

Dot Plots

A Thirty Year Comparison

Aspect   Macintosh     MacBook           Change
Year     1984          2014              +30
Cost     $2,500        $999              2/5x
Speed    8MHz          1.4GHz            175x
Memory   128KB         4GB               30,000x
Pixels   512 x 342     1440 x 900        7.4x
Screen   72PPI (9in)   128PPI (13.3in)   1.8x
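The Change column can be reproduced with a little arithmetic. A minimal sketch in R, assuming binary units (1 KB = 1024 bytes, 1 GB = 1024^3 bytes). The vector names are illustrative and not from the slides.

```r
# 1984 Macintosh vs. 2014 MacBook, using the values from the comparison table
old <- c(cost_usd = 2500, speed_hz = 8e6,   memory_bytes = 128 * 1024,
         pixels = 512 * 342,  ppi = 72)
new <- c(cost_usd = 999,  speed_hz = 1.4e9, memory_bytes = 4 * 1024^3,
         pixels = 1440 * 900, ppi = 128)

# Ratio of new to old for every aspect
ratios <- new / old
round(ratios, 2)
```

Under these assumptions the memory ratio is 32,768, which the table rounds to 30,000x.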

Leland Wilkinson (1999)

Book: The Grammar of Graphics

Grammar: “the fundamental principles or rules of an art or science”

"…rules for constructing graphs mathematically and then representing them as graphics aesthetically."

Three metaphors for thinking about visualization

  • Canvas (Sketch it out)
  • Graphics (Objects to construct graphs)
  • Charts (Catalog of chart types)

R Graphics Packages

  • Base graphics: Written by Ross Ihaka, based on experience with S graphics. It follows a pen-on-paper model, with no (user-accessible) representation of the graphics. Base graphics functions are generally fast but have limited scope.

  • grid graphics: Developed by Paul Murrell (2000). Grid grobs (graphical objects) can be represented independently of the plot and modified later. Grid provides drawing primitives but no tools for producing statistical graphics.

  • lattice: Developed by Deepayan Sarkar (2008). It uses grid graphics to implement the trellis graphics system of Cleveland. Conditioned plots are easy to produce, but it lacks a formal model.

  • ggplot2: Developed by Hadley Wickham (2007). It combines the good parts of lattice with an underlying layered grammar of graphics. A wide range of graphics can be drawn with compact syntax and independent components.

The layered grammar of graphics (1/4)

The layered grammar of graphics (2/4)

The layered grammar of graphics (3/4)

The layered grammar of graphics (4/4)

ggplot2: Layered grammar of graphics

  • data: The data that you want to visualise.
  • aes: A set of aesthetic mappings describing how variables in the data are mapped to aesthetic attributes that you can perceive.
  • geom: Geometric objects represent what you actually see on the plot: points, lines, polygons, etc.
  • stat: Statistical transformations summarise data in many useful ways. For example, binning and counting observations to create a histogram, or summarising a 2d relationship with a linear model.

ggplot2: Layered grammar of graphics

  • scales: The scales map values in the data space to values in an aesthetic space, whether it be colour, or size, or shape. Scales draw a legend or axes, which provide an inverse mapping to make it possible to read the original data values from the graph.
  • coord: A coordinate system describes how data coordinates are mapped to the plane of the graphic. It also provides axes and gridlines to make it possible to read the graph. We normally use a Cartesian coordinate system, but a number of others are available, including polar coordinates and map projections.
  • facet: A faceting specification describes how to break up the data into subsets and how to display those subsets as small multiples. This is also known as conditioning or latticing/trellising.
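To make the components concrete, here is a minimal sketch that spells out each one in a single plot. It assumes ggplot2 is installed and uses its bundled `diamonds` data. The particular choices (log scale, transparency, faceting by cut) are only illustrative.

```r
library(ggplot2)

p <- ggplot(data = diamonds,                        # data
            mapping = aes(x = carat, y = price)) +  # aes
  geom_point(stat = "identity", alpha = 0.1) +      # geom + stat
  scale_y_log10() +                                 # scale
  coord_cartesian() +                               # coord
  facet_wrap(~ cut)                                 # facet
p
```

Most of these components have defaults, so in everyday use only the data, the aesthetic mapping and a geom need to be written out.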

Getting started with ggplot2

install.packages('ggplot2')
library(ggplot2)

Basic Syntax

Main arguments

  • data set, usually a data.frame
  • aesthetic mappings provided by aes function

General ggplot syntax

ggplot(data, aes(…)) + geom_x() + … + stat_x() + …

Layer specifications

  • geom_*(mapping, data, …, stat, position)
  • stat_*(mapping, data, …, geom, position)

Additional components: scales, coordinates, facet
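Every layer pairs a geom with a stat, so the same layer can usually be written from either side. A small sketch, assuming ggplot2 2.0 or later (where `stat_count()` and `layer_data()` are available). The object names are illustrative.

```r
library(ggplot2)

# The same bar chart, specified geom-first and stat-first
p_geom <- ggplot(diamonds, aes(cut)) + geom_bar(stat = "count")
p_stat <- ggplot(diamonds, aes(cut)) + stat_count(geom = "bar")

# Both layers compute the same counts per cut
layer_data(p_geom)$count
layer_data(p_stat)$count
```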

Diamonds Dataset

data(diamonds)
names(diamonds)
##  [1] "carat"   "cut"     "color"   "clarity" "depth"   "table"   "price"  
##  [8] "x"       "y"       "z"
head(diamonds)
##   carat       cut color clarity depth table price    x    y    z
## 1  0.23     Ideal     E     SI2  61.5    55   326 3.95 3.98 2.43
## 2  0.21   Premium     E     SI1  59.8    61   326 3.89 3.84 2.31
## 3  0.23      Good     E     VS1  56.9    65   327 4.05 4.07 2.31
## 4  0.29   Premium     I     VS2  62.4    58   334 4.20 4.23 2.63
## 5  0.31      Good     J     SI2  63.3    58   335 4.34 4.35 2.75
## 6  0.24 Very Good     J    VVS2  62.8    57   336 3.94 3.96 2.48

Diamonds - Description

?diamonds

A data frame with 53940 rows and 10 variables

  • price: price in US dollars ($326–$18,823)
  • carat: weight of the diamond (0.2–5.01)
  • cut: quality of the cut (Fair, Good, Very Good, Premium, Ideal)
  • color: diamond colour, from J (worst) to D (best)
  • clarity: a measurement of how clear the diamond is (I1 (worst), SI2, SI1, VS2, VS1, VVS2, VVS1, IF (best))
  • x: length in mm (0–10.74)
  • y: width in mm (0–58.9)
  • z: depth in mm (0–31.8)
  • depth: total depth percentage = z / mean(x, y) = 2 * z / (x + y) (43–79)
  • table: width of top of diamond relative to widest point (43–95)
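The depth formula can be checked against the first six rows printed by head(diamonds). This is a rough sketch: the values are copied by hand from that output, the factor of 100 (to express depth as a percentage) is an assumption that the formula leaves implicit, and `depth_calc` is an illustrative column name.

```r
# x, y, z and depth for the first six diamonds, copied from head(diamonds)
first_rows <- data.frame(
  x     = c(3.95, 3.89, 4.05, 4.20, 4.34, 3.94),
  y     = c(3.98, 3.84, 4.07, 4.23, 4.35, 3.96),
  z     = c(2.43, 2.31, 2.31, 2.63, 2.75, 2.48),
  depth = c(61.5, 59.8, 56.9, 62.4, 63.3, 62.8)
)

# depth (%) = 100 * 2 * z / (x + y)
first_rows$depth_calc <- round(100 * 2 * first_rows$z /
                                 (first_rows$x + first_rows$y), 1)
first_rows
```

The recomputed values track the recorded depth closely. Small differences are to be expected because x, y and z are themselves rounded.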

Diamonds - Photo

Diamonds - Structure

str(diamonds)
## 'data.frame':    53940 obs. of  10 variables:
##  $ carat  : num  0.23 0.21 0.23 0.29 0.31 0.24 0.24 0.26 0.22 0.23 ...
##  $ cut    : Ord.factor w/ 5 levels "Fair"<"Good"<..: 5 4 2 4 2 3 3 3 1 3 ...
##  $ color  : Ord.factor w/ 7 levels "D"<"E"<"F"<"G"<..: 2 2 2 6 7 7 6 5 2 5 ...
##  $ clarity: Ord.factor w/ 8 levels "I1"<"SI2"<"SI1"<..: 2 3 5 4 2 6 7 3 4 5 ...
##  $ depth  : num  61.5 59.8 56.9 62.4 63.3 62.8 62.3 61.9 65.1 59.4 ...
##  $ table  : num  55 61 65 58 58 57 57 55 61 61 ...
##  $ price  : int  326 326 327 334 335 336 336 337 337 338 ...
##  $ x      : num  3.95 3.89 4.05 4.2 4.34 3.94 3.95 4.07 3.87 4 ...
##  $ y      : num  3.98 3.84 4.07 4.23 4.35 3.96 3.98 4.11 3.78 4.05 ...
##  $ z      : num  2.43 2.31 2.31 2.63 2.75 2.48 2.47 2.53 2.49 2.39 ...

Explore One Variable Distribution

Categorical

Bar, Stacked Bar, Coxcomb, Pie, Bullseye

Continuous

Histogram, Box Plot

Clarity - Summary

summary(diamonds$clarity)
##    I1   SI2   SI1   VS2   VS1  VVS2  VVS1    IF 
##   741  9194 13065 12258  8171  5066  3655  1790
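Raw counts are easier to compare as shares of the total. A minimal sketch that re-enters the counts from the summary above by hand (so it runs without ggplot2). `clarity_counts` and `clarity_share` are illustrative names.

```r
# Counts per clarity level, copied from summary(diamonds$clarity)
clarity_counts <- c(I1 = 741, SI2 = 9194, SI1 = 13065, VS2 = 12258,
                    VS1 = 8171, VVS2 = 5066, VVS1 = 3655, IF = 1790)

# Total number of diamonds
sum(clarity_counts)

# Percentage of diamonds at each clarity level
clarity_share <- round(100 * prop.table(clarity_counts), 1)
clarity_share
```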

Clarity - Bar Chart

ggplot(diamonds, aes(clarity)) +
  geom_bar()

Clarity - Bar Chart (with fill)

ggplot(diamonds, aes(clarity, fill=clarity)) +
  geom_bar()

Clarity - Bar Chart (with fill, no gap)

ggplot(diamonds, aes(clarity, fill=clarity)) +
  geom_bar(width = 1)

Clarity - Coxcomb Chart

ggplot(diamonds, aes(clarity, fill=clarity)) +
  geom_bar(width = 1) + coord_polar()

Clarity - Stacked Bar Chart

ggplot(diamonds, aes(x="", fill=clarity)) +
  geom_bar()

Clarity - Pie Chart

ggplot(diamonds, aes(x= "", fill=clarity)) +
  geom_bar() + coord_polar(theta = "y")

Clarity - Bullseye Chart

ggplot(diamonds, aes(x= "", fill=clarity)) +
  geom_bar(width = 1) + coord_polar(theta = "x")

Price - Summary

summary(diamonds$price)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     326     950    2400    3930    5320   18800

Price - Histogram

ggplot(diamonds, aes(price)) +
  geom_histogram()
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
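The message says the default bin width is the data range divided by 30. A quick sketch of what that means for price, using the minimum and maximum from the data set description ($326 and $18,823). Newer ggplot2 releases word this message in terms of `bins = 30` instead. The variable names are illustrative.

```r
# Price range taken from the diamonds description
price_range <- c(326, 18823)

# Default bin width: range / 30
default_binwidth <- diff(price_range) / 30
default_binwidth

# Roughly how many bins the alternative widths would need
diff(price_range) / c(binwidth_500 = 500, binwidth_50 = 50)
```

A default width of several hundred dollars is fairly coarse for this data, which is one reason to try explicit bin widths.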

Price - Histogram (Binwidth = 500)

ggplot(diamonds, aes(price)) +
  geom_histogram(binwidth = 500)

Price - Histogram (Binwidth = 50)

ggplot(diamonds, aes(price)) +
  geom_histogram(binwidth = 50)

Price - Histogram (Binwidth = 50, fill = count)

ggplot(diamonds, aes(price, fill=..count..)) +
  geom_histogram(binwidth = 50)

Price - Box-Plot

ggplot(diamonds, aes("", price)) +
  geom_boxplot()

Price - Box-Plot (Flipped)

ggplot(diamonds, aes(x="", price)) +
  geom_boxplot() + coord_flip()

Exploring Two or More Variables

Categorical vs. Categorical

Stacked Bar, Mosaic

Continuous vs. Categorical

Histogram - Aesthetics, Facets, Frequency Polygon, Density

Continuous vs. Continuous

Scatterplot - Aesthetics, Facets

Cut vs. Clarity - Summary

by(diamonds$cut, diamonds$clarity, summary)
## diamonds$clarity: I1
##      Fair      Good Very Good   Premium     Ideal 
##       210        96        84       205       146 
## -------------------------------------------------------- 
## diamonds$clarity: SI2
##      Fair      Good Very Good   Premium     Ideal 
##       466      1081      2100      2949      2598 
## -------------------------------------------------------- 
## diamonds$clarity: SI1
##      Fair      Good Very Good   Premium     Ideal 
##       408      1560      3240      3575      4282 
## -------------------------------------------------------- 
## diamonds$clarity: VS2
##      Fair      Good Very Good   Premium     Ideal 
##       261       978      2591      3357      5071 
## -------------------------------------------------------- 
## diamonds$clarity: VS1
##      Fair      Good Very Good   Premium     Ideal 
##       170       648      1775      1989      3589 
## -------------------------------------------------------- 
## diamonds$clarity: VVS2
##      Fair      Good Very Good   Premium     Ideal 
##        69       286      1235       870      2606 
## -------------------------------------------------------- 
## diamonds$clarity: VVS1
##      Fair      Good Very Good   Premium     Ideal 
##        17       186       789       616      2047 
## -------------------------------------------------------- 
## diamonds$clarity: IF
##      Fair      Good Very Good   Premium     Ideal 
##         9        71       268       230      1212

Cut vs. Clarity - Stacked Bar Chart

ggplot(diamonds, aes(x=cut, fill=clarity)) +
  geom_bar()

Cut vs. Clarity - Dodged Bar Chart

ggplot(diamonds, aes(x=cut, fill=clarity)) +
  geom_bar(position = "dodge")

Cut vs. Clarity - Stacked Bar Chart - 100%

ggplot(diamonds, aes(x=cut, fill=clarity)) +
  geom_bar(position = "fill")
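With position = "fill", every bar is rescaled to a height of 1, so the segments show the share of each clarity level within a cut. The numbers behind that picture can be sketched with base R tables, assuming ggplot2 is installed to supply `diamonds`. `counts` and `shares` are illustrative names.

```r
library(ggplot2)  # provides the diamonds data set

# Counts of clarity within each cut (rows = cut, columns = clarity)
counts <- table(diamonds$cut, diamonds$clarity)

# Row-wise proportions: each cut sums to 1, as in the 100% stacked bar
shares <- prop.table(counts, margin = 1)
round(shares, 3)
```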

Cut vs. Clarity - Mosaic Plot

No direct function - But you can easily write it

ggMMplot <- function(var1, var2){
  require(ggplot2)
  levVar1 <- length(levels(var1))

  # Joint proportions of var1 x var2, one row per cell (var1 varies fastest)
  jointTable <- prop.table(table(var1, var2))
  plotData <- as.data.frame(jointTable)
  # Marginal proportion of each var1 level, recycled across the var2 levels
  plotData$marginVar1 <- prop.table(table(var1))
  # Segment height: share of var2 within each var1 level
  plotData$var2Height <- plotData$Freq / plotData$marginVar1
  # Bar centre: cumulative margin of the preceding levels plus half the bar width
  plotData$var1Center <- c(0, cumsum(plotData$marginVar1)[seq_len(levVar1 - 1)]) +
    plotData$marginVar1 / 2

  ggplot(plotData, aes(var1Center, var2Height)) +
    geom_bar(stat = "identity", aes(width = marginVar1, fill = var2), col = "Black") +
    geom_text(aes(label = as.character(var1), x = var1Center, y = 1.05))
}

Cut vs. Clarity - Mosaic Plot

ggMMplot(diamonds$cut, diamonds$clarity)
## Warning: position_stack requires constant width: output may be incorrect

Price vs. Cut - Summary

by(diamonds$price, diamonds$cut, summary)
## diamonds$cut: Fair
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     337    2050    3280    4360    5210   18600 
## -------------------------------------------------------- 
## diamonds$cut: Good
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     327    1140    3050    3930    5030   18800 
## -------------------------------------------------------- 
## diamonds$cut: Very Good
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     336     912    2650    3980    5370   18800 
## -------------------------------------------------------- 
## diamonds$cut: Premium
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     326    1050    3180    4580    6300   18800 
## -------------------------------------------------------- 
## diamonds$cut: Ideal
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     326     878    1810    3460    4680   18800

Price vs. Cut - Histogram and Aesthetic

ggplot(diamonds, aes(price, fill=cut)) +
  geom_histogram(binwidth = 500)

Price vs. Cut - Histogram and Facet

ggplot(diamonds, aes(price, fill=cut)) +
  geom_histogram(binwidth = 500) + facet_wrap(~ cut)

Price vs. Cut - Histogram, Facet, scales free

ggplot(diamonds, aes(price, fill=cut)) +
  geom_histogram(binwidth = 500) + facet_wrap(~ cut, scales="free")

Price vs. Cut - Frequency Polygon

ggplot(diamonds, aes(price, color = cut)) +
  geom_freqpoly(binwidth = 500)

Price vs. Cut - Frequency Polygon of Density

ggplot(diamonds, aes(price, ..density.., color=cut)) +
  geom_freqpoly(binwidth = 500)

Price vs. Cut - Histogram and Density

ggplot(diamonds, aes(price, ..density.. , fill=cut)) +
  geom_histogram(binwidth = 500) + facet_wrap(~ cut)
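The ..density.. variable rescales each group's counts so that the bar areas sum to 1, which makes groups of very different sizes comparable. A minimal base-R sketch of that rescaling on simulated data (the exponential sample is purely hypothetical, not the diamonds prices):

```r
set.seed(1)
x <- rexp(1000, rate = 1 / 4000)   # simulated, price-like values
binwidth <- 500

# Bin edges that cover the whole sample
breaks <- seq(0, ceiling(max(x) / binwidth) * binwidth, by = binwidth)
h <- hist(x, breaks = breaks, plot = FALSE)

# density = count / (total count * bin width)
density_manual <- h$counts / (sum(h$counts) * binwidth)

all.equal(density_manual, h$density)  # the manual rescaling matches hist()
sum(h$density * binwidth)             # total bar area
```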

Cut vs. Price - Point Plot

ggplot(diamonds, aes(cut, price, color = cut)) + 
  geom_point()

Cut vs. Price - Jitter Plot

ggplot(diamonds, aes(cut, price, color = cut)) + 
  geom_jitter()

Price vs Carat - Scatter Plot

ggplot(data = diamonds, aes(carat, price)) +
  geom_point()

Price vs Carat - Scatter Plot, Size

ggplot(data = diamonds, aes(carat, price)) +
  geom_point(size = 1)

Price vs Carat - Scatter Plot, Alpha

ggplot(data = diamonds, aes(carat, price)) +
  geom_point(alpha = 1/20)

Price vs Carat - Scatter Plot, Jitter

ggplot(data = diamonds, aes(carat, price)) +
  geom_jitter()

Price vs Carat - Scatter Plot, Smooth

ggplot(diamonds, aes(carat, price)) +
  geom_point() + geom_smooth()
## geom_smooth: method="auto" and size of largest group is >=1000, so using gam with formula: y ~ s(x, bs = "cs"). Use 'method = x' to change the smoothing method.

Price vs Carat - Scatter Plot, Axis Limit (Zoom)

ggplot(diamonds, aes(carat, price)) +
  xlim(c(0, 3.1)) + geom_point()
## Warning: Removed 14 rows containing missing values (geom_point).
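The warning is a reminder that xlim() sets the scale limits, so points outside them are dropped before anything is drawn or fitted. If the goal is only to zoom the view, coord_cartesian() is one alternative that keeps every row. A minimal sketch, assuming ggplot2 2.0 or later for `layer_data()`. `p_zoom` is an illustrative name.

```r
library(ggplot2)

# Zoom the x axis without removing any observations
p_zoom <- ggplot(diamonds, aes(carat, price)) +
  geom_point() +
  coord_cartesian(xlim = c(0, 3.1))

# All rows are still present in the layer data
nrow(layer_data(p_zoom)) == nrow(diamonds)
```

The difference matters most once a statistic such as geom_smooth() is added, because a fit computed after rows are dropped can differ from one computed on the full data.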

Price vs Carat - Scatter Plot, Axis Transform

ggplot(diamonds, aes(carat, price)) +
  scale_y_log10() + geom_point()

Price vs Carat vs Cut - Scatter Plot, Aesthetics

ggplot(diamonds, aes(carat, price, color=cut)) +
   geom_point()

Price vs Carat vs Cut - Scatter Plot, Facet

ggplot(diamonds, aes(carat, price, color=cut)) +
   geom_point(size=1) + facet_wrap(~ cut)

Replicating Minard's Chart

Minard's Chart - Troops

## troops <- read.table(url("http://amitkaps.com/data/minard-troops.txt"), header = TRUE)
troops <- read.table("minard-troops.txt", header = TRUE)
troops
##    long  lat survivors direction group
## 1  24.0 54.9    340000         A     1
## 2  24.5 55.0    340000         A     1
## 3  25.5 54.5    340000         A     1
## 4  26.0 54.7    320000         A     1
## 5  27.0 54.8    300000         A     1
## 6  28.0 54.9    280000         A     1
## 7  28.5 55.0    240000         A     1
## 8  29.0 55.1    210000         A     1
## 9  30.0 55.2    180000         A     1
## 10 30.3 55.3    175000         A     1
## 11 32.0 54.8    145000         A     1
## 12 33.2 54.9    140000         A     1
## 13 34.4 55.5    127100         A     1
## 14 35.5 55.4    100000         A     1
## 15 36.0 55.5    100000         A     1
## 16 37.6 55.8    100000         A     1
## 17 37.7 55.7    100000         R     1
## 18 37.5 55.7     98000         R     1
## 19 37.0 55.0     97000         R     1
## 20 36.8 55.0     96000         R     1
## 21 35.4 55.3     87000         R     1
## 22 34.3 55.2     55000         R     1
## 23 33.3 54.8     37000         R     1
## 24 32.0 54.6     24000         R     1
## 25 30.4 54.4     20000         R     1
## 26 29.2 54.3     20000         R     1
## 27 28.5 54.2     20000         R     1
## 28 28.3 54.3     20000         R     1
## 29 27.5 54.5     20000         R     1
## 30 26.8 54.3     12000         R     1
## 31 26.4 54.4     14000         R     1
## 32 25.0 54.4      8000         R     1
## 33 24.4 54.4      4000         R     1
## 34 24.2 54.4      4000         R     1
## 35 24.1 54.4      4000         R     1
## 36 24.0 55.1     60000         A     2
## 37 24.5 55.2     60000         A     2
## 38 25.5 54.7     60000         A     2
## 39 26.6 55.7     40000         A     2
## 40 27.4 55.6     33000         A     2
## 41 28.7 55.5     33000         A     2
## 42 28.7 55.5     33000         R     2
## 43 29.2 54.2     30000         R     2
## 44 28.5 54.1     30000         R     2
## 45 28.3 54.2     28000         R     2
## 46 24.0 55.2     22000         A     3
## 47 24.5 55.3     22000         A     3
## 48 24.6 55.8      6000         A     3
## 49 24.6 55.8      6000         R     3
## 50 24.2 54.4      6000         R     3
## 51 24.1 54.4      6000         R     3

Plot the troops

plot_troops <- ggplot(troops, aes(long, lat)) +
  geom_path(aes(size = survivors, color = direction, group = group))
plot_troops

Minard's Chart - Cities

## cities <- read.table(url("http://amitkaps.com/data/minard-cities.txt"), header = TRUE)
cities <- read.table("minard-cities.txt", header = TRUE)
cities
##    long  lat           city
## 1  24.0 55.0          Kowno
## 2  25.3 54.7          Wilna
## 3  26.4 54.4       Smorgoni
## 4  26.8 54.3      Moiodexno
## 5  27.7 55.2      Gloubokoe
## 6  27.6 53.9          Minsk
## 7  28.5 54.3     Studienska
## 8  28.7 55.5        Polotzk
## 9  29.2 54.4           Bobr
## 10 30.2 55.3        Witebsk
## 11 30.4 54.5         Orscha
## 12 30.4 53.9        Mohilow
## 13 32.0 54.8       Smolensk
## 14 33.2 54.9    Dorogobouge
## 15 34.3 55.2          Wixma
## 16 34.4 55.5          Chjat
## 17 36.0 55.5        Mojaisk
## 18 37.6 55.8         Moscou
## 19 36.6 55.3      Tarantino
## 20 36.5 55.0 Malo-Jarosewii

Add the cities to the troops plot

plot_troops_cities <- plot_troops +
  geom_text(aes(label = city), size = 4, data = cities)
plot_troops_cities

Put some polish - map projection

library(maps)
library(mapproj)

plot_polished <- plot_troops_cities +
  scale_size(range = c(1, 10), 
             breaks = c(1, 2, 3) * 10^5,
             labels = c(1, 2, 3) * 10^5 )+
  scale_color_manual(values = c("grey50","red")) +
  xlab(NULL) +
  ylab(NULL) +
  coord_map()

Put some polish - map

plot_polished

Continue to learn ggplot2

Visualization Assignment

You are working as a team member in a large global project to develop the digital ad strategy for your company. As part of the project, you need to provide an overview of the computing devices the consumers are likely to use to interact with these digital ads.

You have received a spreadsheet from an analyst about these computing devices. The devices are tracked in three main categories: PCs (including desktops and laptops), Tablets and Smartphones. The data sheet includes historical and forecasted data on shipments (devices shipped to the consumer) and installed base (devices being used by the consumers) for these computing devices. In addition, you also have the same data segmented by the Operating System (OS) used on each of these devices.

The data sheet by the analyst is available at http://goo.gl/Zy6lcR

Visualization Assignment (contd.)

You need to develop a short data visualization for this data set and problem statement (using your preferred visualization and presentation tool). Please use the data shared by the analyst, though you are free to enrich it with any additional data or insights from external sources.

You will have 5 minutes to share this overview with the global project team as part of the next project discussion. Please prepare the visualizations accordingly.
